NBIS ID: SMS_6198
Report Version: 1.0
Request by: Mona N. Högberg ()
Principal Investigator: Mona N. Högberg ()
Organisation: SLU
NBIS Staff: Juliana Assis ()


1 Setup

## LIBRARIES
library("dada2")
library("devtools")
library("dplyr")
library("ggplot2")
library("microbiome")
library("phangorn") 
library("phyloseq") 
library("Rcpp")
library("reshape2")
library("tidyr")
library("vegan")
library("ShortRead")
library("Biostrings")
library("DECIPHER")
library("SensusR")
library("gplots")
library("gridExtra")
library("grid")
library("ggpubr")
library("reshape2")
library("reshape")
library("lulu")
library("ggrepel")
library("ggh4x")
library("RColorBrewer")
library("rITSx")
library("MicEco")
library("agricolae")
library("ampvis2")
library("pheatmap")
library("tidyverse")

2 Version

1.0

  • Support Request Request sent by the user:
    Mona N. Högberg

3 Data

96 samples

  • Type of data

ITS amplicon

  • Data location rackham.uppmax.uu.se /proj/snic2022-22-352

  • Uppmax project ID SNIC 2022/22-352

  • NGI Project ID P9723

  • Database used Unite

4 Tools

NFCore-Ampliseq (Dada2)

#Sample Info
head(sample_info_tab)

5 Workflow

I’m describing here the most important steps in the analysis. Moreover, it is possible to find all parameters and figures at the .rmd file.

5.0.1 Inspect read quality profiles

Quality profiles of the forward reads:

QC Foward Reads.

QC Foward Reads.

In gray-scale is a heat map of the frequency of each quality score at each base position. The median quality score at each position is shown by the green line, and the quartiles of the quality score distribution by the orange lines. The red line shows the scaled proportion of reads that extend to at least that position

The forward reads are good quality I truncated the forward reads at position 240.

Quality profile of the reverse reads:

QC Reverse Reads.

QC Reverse Reads.

The reverse reads are of significantly worse quality, especially at the end, which is common in Illumina sequencing. This isn’t too worrisome, as DADA2 incorporates quality information into its error model which makes the algorithm robust to lower quality sequence, but trimming as the average qualities crash will improve the algorithm’s sensitivity to rare sequence variants. Based on these profiles, I truncated the reverse reads at position 200 where the quality distribution crashes.

5.0.2 Identify primers

The universal primer set 5.8S and ITS1F (Fierer et al., 2005; Yarwood et al., 2010) primers were used to amplify this dataset. We record the DNA sequences, including ambiguous nucleotides, for those primers.

FWD <- “CTTGGTCATTTAGAGGAAGTAA”
REV <- “GCTGCGTTCTTCATCGATGC”

Sanity check of primers and adapters

##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0

Success! Primers are no longer detected in the cutadapted reads.

The primer-free sequence files are now ready to be analyzed through the DADA2 pipeline.

5.0.3 Learn the Error Rates

The DADA2 method relies on a parameterized model of substitution errors to distinguish sequencing errors from real biological variation. Because error rates can (and often do) vary substantially between sequencing runs and PCR protocols, the model parameters can be discovered from the data itself using a form of unsupervised learning in which sample inference is alternated with parameter estimation until both are jointly consistent.

In order to verify that the error rates have been reasonably well-estimated, we inspect the fit between the observed error rates (black points) and the fitted error rates (black lines) in the next Figure:

Error Rate Foward Reads.

Error Rate Foward Reads.

Error Rate Foward Reads.

Error Rate Foward Reads.

Everything looks reasonable and we proceed with confidence.

After filtering, we moved to the next step:

Sequence inference: The DADA2 sequence inference step removed - nearly - all substitution and indel errors from the data. We now merge together the inferred forward and reverse sequences, removing paired sequences that do not perfectly overlap as a final control against residual errors.

5.0.4 Construct sequence table and remove chimeras

The DADA2 method produces a sequence table valued by the number of times each sequence was observed in each sample.

Notably, chimeras have not yet been removed. The error model in the sequence inference algorithm does not include a chimera component, and therefore we expect this sequence table to include many chimeric sequences. We now remove chimeric sequences by comparing each inferred sequence to the others in the table, and removing those that can be reproduced by stitching together two more abundant sequences.

The frequency of chimeric sequences varies substantially from dataset to dataset, and depends on on factors including experimental procedures and sample complexity.

sum(seqtab.nochim)/sum(seqtab)
## [1] 0.950831

5.0.5 Inspect distribution of sequence lengths:

As expected, quite a bit of length variability in the the amplified ITS region.

Distribution of sequence lengths.

Distribution of sequence lengths.

## null device 
##           1

5.0.6 Track reads through the pipeline

We now inspect the the number of reads that made it through each step in the pipeline to verify everything worked as expected.

Track Reads.

Track Reads.

## null device 
##           1

Looks good! We kept the majority of our raw reads. The most impacting was the filter step.

5.0.7 Taxa prevalence versus total counts.

Taxa prevalence.

Taxa prevalence.

Each point is a different taxa. Exploration of the data in this way is often useful for selecting filtering parameters, like the minimum prevalence criteria we will used to filter the data above.

In this study, I removed taxa not seen more than 2 times in at least 1% of the samples. This protects against an ASV with small mean & trivially large C.V.

Count prevalence
- Kingdom = 1
Phylum = 15
Class = 53
Order = 128
—- Family = 232
—– Genus = 458
—— ASVs = 551

5.1 Replicates Evaluation

In this step an evaluation of the replicates has been performed. The main goal was to understand if they are any significant difference between the replicates A and B.

Alpha Diversity Plot comparing the Replicates A and B.

Quality control analysis using matched samples from 3 different Transects: A, B and C of the experiment and replicates samples on each Ecotype. Comparison of alpha diversity in technical replicates samples on all Ecotypes from each Transect. ASV richness and ASV Shannon diversity.

Alpha Diversity.

Alpha Diversity.

Evaluation of alpha diversity: looking for difference at the count read distribution

Beta Diversity Plot comparing the Replicates A and B.

Quality control analysis using matched samples from replicates A and B. Beta diversity using Jensen-Shannon distance

Beta Diversity Distance

Beta Diversity Distance

Beta Diversity Plot comparing the Replicates A and B.

Beta Diversity Replicates.

Beta Diversity Replicates.

## null device 
##           1
Beta Diversity Replicates by Type.

Beta Diversity Replicates by Type.

## null device 
##           1

In summary: there is no difference between A and B samples. In this way we choose working the replicate A.

Rarefaction
Rarefaction.

Rarefaction.

Rarefaction.

Rarefaction.

Rarefaction.

Rarefaction.

5.2 Main Analysis

  • Alpha Diversity Plot - A.

Raw alpha-diversity, i.e. without rarefying

  • comparing alpha diversity based on raw data has the huge problem of ignoring sample size differences. Usually there is a trend that more reads correlate with higher diversity.

  • Here, therefore I checked, whether there are significant sample size differences between the groups.

  • We further look at the residuals of a linear fit alpha diversity vs sample size to correct for this confounder.

  • Check whether sample_sums/sample sizes/library sizes differ between groups

  • sample size adjustment has no influence on richness, so sample_sizes are compared for raw counts

AlphaDiversity Main Analysis.

AlphaDiversity Main Analysis.

Alpha Diversity = BoxPlot

AlphaDiversity Main Analysis.

AlphaDiversity Main Analysis.

#https://grunwaldlab.github.io/analysis_of_microbiome_community_data_in_r/07--diversity_stats.html
anova_result <- aov(df2$Observed ~ Transect * Ecotype, data = df2)
summary(aov(df2$Observed ~ Transect * Ecotype, data = df2))
##                  Df Sum Sq Mean Sq F value        Pr(>F)    
## Transect          2   1715     858   0.733      0.488718    
## Ecotype           4 151580   37895  32.400 0.00000000017 ***
## Transect:Ecotype  8  51828    6478   5.539      0.000243 ***
## Residuals        30  35087    1170                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
aresult <- HSD.test(anova_result, "Ecotype", group = TRUE)

Plot by mean

AlphaDiversity Main Analysis.

AlphaDiversity Main Analysis.

  • Beta Diversity Plot - A.
gpca  <- ordinate(Plots, "MDS")
# Scree plot
plot_scree(gpca, "Scree Plot for MDS Analysis")

#Remove ASVs that do not show appear more than 5 times in more than half the samples
genefilter = genefilter_sample(P_Plots, filterfun_sample(function(x) x > 5), A=0.5*nsamples(P_Plots))
genefilter_tax = prune_taxa(genefilter, P_Plots)

#Transform to even sampling depth.
genefilter_tax_t = transform_sample_counts(genefilter_tax, function(x) 1E6 * x/sum(x))

genefilter_tax_t.ord <- ordinate(genefilter_tax_t, "NMDS")
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.1177618 
## Run 1 stress 0.1177616 
## ... New best solution
## ... Procrustes: rmse 0.0001617719  max resid 0.0009389746 
## ... Similar to previous best
## Run 2 stress 0.1177617 
## ... Procrustes: rmse 0.000219552  max resid 0.001280429 
## ... Similar to previous best
## Run 3 stress 0.1345524 
## Run 4 stress 0.1177616 
## ... Procrustes: rmse 0.000008377792  max resid 0.00003453982 
## ... Similar to previous best
## Run 5 stress 0.136347 
## Run 6 stress 0.1177616 
## ... New best solution
## ... Procrustes: rmse 0.00005481706  max resid 0.0003228071 
## ... Similar to previous best
## Run 7 stress 0.117762 
## ... Procrustes: rmse 0.0002667922  max resid 0.001550909 
## ... Similar to previous best
## Run 8 stress 0.1492259 
## Run 9 stress 0.1177616 
## ... New best solution
## ... Procrustes: rmse 0.000007245489  max resid 0.00002199715 
## ... Similar to previous best
## Run 10 stress 0.1161722 
## ... New best solution
## ... Procrustes: rmse 0.04932887  max resid 0.2692485 
## Run 11 stress 0.1177617 
## Run 12 stress 0.1403949 
## Run 13 stress 0.1177616 
## Run 14 stress 0.1177616 
## Run 15 stress 0.1494246 
## Run 16 stress 0.1151887 
## ... New best solution
## ... Procrustes: rmse 0.01123477  max resid 0.06391121 
## Run 17 stress 0.1177616 
## Run 18 stress 0.1177616 
## Run 19 stress 0.1177616 
## Run 20 stress 0.1509306 
## *** No convergence -- monoMDS stopping criteria:
##     18: stress ratio > sratmax
##      2: scale factor of the gradient < sfgrmin
#
genefilter_tax_t@sam_data$Transect <- factor(genefilter_tax_t@sam_data$Transect, levels = c(1,2,3), labels = c("1","2", "3"))

genefilter_tax_t@sam_data$Ecotype <- factor(genefilter_tax_t@sam_data$Ecotype, levels = c("Meadow","Alder zone","Spruce-Alder", "Spruce","Pine"), labels = c("Meadow","Alder","Spruce-Alder", "Spruce","Pine"))

#
ord_DataFrame_tx <- plot_ordination(genefilter_tax_t, genefilter_tax_t.ord, "samples", justDF ="TRUE")

#
ordplot_tx <- (ordplot <- plot_ordination(genefilter_tax_t, genefilter_tax_t.ord, "samples", color="Ecotype", shape = "Transect") + 
               geom_hline(yintercept = 0, linetype = 'dashed', alpha = 0.3) + 
               geom_vline(xintercept = 0, linetype = 'dashed', alpha = 0.3) +
               geom_point(size = 5, color = "grey") +
               scale_shape_manual(values=c(23,21,24)) +
               theme_pubr(border = TRUE) +
               coord_fixed(ratio = 1) +
               theme(axis.text=element_text(size=14), 
                     axis.text.x = element_text(size = 12, hjust = 0.5), 
                     axis.title.y = element_text(size = 18),
                     legend.text=element_text(size=14), 
                     legend.title=element_text(size=0),
                     legend.position="bottom",
                     axis.title.x = element_text(size = 18), 
                     strip.text.x = element_text(size = 20, face = "bold"))) +
  scale_color_manual(values = cols_Ecotype)#+
ordplot_tx$layers <- ordplot_tx$layers[-1]
ordplot_tx + 
  geom_point(data=ord_DataFrame_tx, aes(x = NMDS1, y = NMDS2, fill =Ecotype, size = 4))  +
  scale_fill_manual(values = cols_Ecotype) 

pord2 = (ordplot <- plot_ordination(genefilter_tax_t, genefilter_tax_t.ord, "samples", color="Ecotype", shape = "Transect") + 
               geom_hline(yintercept = 0, linetype = 'dashed', alpha = 0.3) + 
               geom_vline(xintercept = 0, linetype = 'dashed', alpha = 0.3) +
               geom_point(size = 5, color = "grey") +
               scale_shape_manual(values=c(23,21,24)) +
               geom_polygon(aes(fill=Ecotype, alpha = 0.8)) + 
               theme_pubr(border = TRUE) +
               coord_fixed(ratio = 1) +
               theme(axis.text=element_text(size=14), 
                     axis.text.x = element_text(size = 12, hjust = 0.5), 
                     axis.title.y = element_text(size = 18),
                     legend.text=element_text(size=14), 
                     legend.title=element_text(size=0),
                     legend.position="bottom",
                     axis.title.x = element_text(size = 18), 
                     strip.text.x = element_text(size = 20, face = "bold"))) +
  scale_color_manual(values = cols_Ecotype)

pord2$layers <- pord2$layers[-1]
pord2 + 
  geom_point(data=ord_DataFrame_tx, aes(x = NMDS1, y = NMDS2, fill =Ecotype, size = 4))  +
  scale_fill_manual(values = cols_Ecotype) 

  • Beta Diversity Plot - B.

Ecotype

  • Permutational analysis of variance

HeatMap

#Heatmap 
meta_data_2 <- read.table("/Users/juliana/Documents/NBIS/Projects/6198/Metadado/metadata_copynumber_sampleA.tsv", header=T, row.names=1, check.names=T, sep="\t")
sample_data(P_Plots) <- meta_data_2

P_Plots@sam_data$CopyNumber2 <- as.numeric(P_Plots@sam_data$CopyNumber)


#Remove ASVs that do not show appear more than 3 times in more than 30% the samples
genefilter2 = genefilter_sample(P_Plots, filterfun_sample(function(x) x > 3), A=0.3*nsamples(P_Plots))
genefilter_tax2 = prune_taxa(genefilter2, P_Plots)
#MetaData
metaR <- meta(genefilter_tax2)

# Reorder Transect
metaR$Transect = factor(metaR$Transect, levels = c("1", "2","3")) #D46C4E
metaR_Transect <- c("#C0C0C0","#BC8F8F","#F5DEB3") #D46C4E #43978D
names(metaR_Transect) <- levels(metaR$Transect)

# Reorder Ecotype
metaR$Ecotype <- factor(metaR$Ecotype, levels = c("Meadow","Alder zone","Spruce-Alder", "Spruce","Pine"), labels = c("Meadow","Alder","Spruce-Alder", "Spruce","Pine"))
metaR_Ecotype <- c("#77A515", "#264D59", "#43978D", "#D46C4E", "#2b8cbe")
names(metaR_Ecotype) <- levels(metaR$Ecotype)

# Add to a list, where names match those in factors dataframe
metaR_AnnColour <- list(
  Transect = metaR_Transect)

metaR_AnnColour2 <- list(
  Ecotype = metaR_Ecotype)

metaR_AnnColourx <- list(
  Transect = metaR_Transect,
  Ecotype = metaR_Ecotype)

# Check the output
metaR_AnnColour
## $Transect
##         1         2         3 
## "#C0C0C0" "#BC8F8F" "#F5DEB3"
SampleOrder = order(metaR$Ecotype, metaR$Transect)
meta.factors <- select(metaR, Transect, Ecotype)


metaR_Filter_composi.filt.abs <- P_Plots
#Subset #Multiplicando pela soma das reads, normalizando
metaR_Filter_composi.filt.ab <- transform_sample_counts(genefilter_tax2, function(x)100*x/sum(x))

###
{
  for(n in 1:nsamples(metaR_Filter_composi.filt.ab))
    
    otu_table(metaR_Filter_composi.filt.abs)<- otu_table(metaR_Filter_composi.filt.ab)*sample_data(metaR_Filter_composi.filt.ab@sam_data)$CopyNumber [n]
}

abs_plot <- data.frame(otu_table(metaR_Filter_composi.filt.abs))

#Colocando ASV names na OTU table
rownames(abs_plot)
##  [1] "ASV1"   "ASV3"   "ASV4"   "ASV5"   "ASV7"   "ASV8"   "ASV11"  "ASV13" 
##  [9] "ASV14"  "ASV18"  "ASV21"  "ASV22"  "ASV25"  "ASV29"  "ASV31"  "ASV34" 
## [17] "ASV37"  "ASV39"  "ASV46"  "ASV48"  "ASV52"  "ASV55"  "ASV56"  "ASV57" 
## [25] "ASV62"  "ASV73"  "ASV82"  "ASV83"  "ASV84"  "ASV85"  "ASV90"  "ASV99" 
## [33] "ASV100" "ASV106" "ASV111" "ASV125" "ASV129" "ASV130" "ASV132" "ASV135"
## [41] "ASV138" "ASV140" "ASV153" "ASV155" "ASV162" "ASV163" "ASV170" "ASV174"
## [49] "ASV176" "ASV187" "ASV197" "ASV202" "ASV206" "ASV208" "ASV210" "ASV212"
## [57] "ASV218" "ASV220" "ASV222" "ASV229" "ASV237" "ASV248" "ASV265" "ASV271"
## [65] "ASV273" "ASV278" "ASV290" "ASV292" "ASV293" "ASV295" "ASV299" "ASV311"
## [73] "ASV316" "ASV319" "ASV341" "ASV347" "ASV349" "ASV352" "ASV362" "ASV376"
## [81] "ASV383" "ASV386" "ASV392" "ASV416" "ASV423" "ASV429" "ASV443" "ASV453"
## [89] "ASV467" "ASV479" "ASV546" "ASV572" "ASV597" "ASV610" "ASV633" "ASV778"
## [97] "ASV779" "ASV787"
taxonomy_otu_compositional_copy <- data.frame(tax_table(metaR_Filter_composi.filt.abs))
tax_m <- taxonomy_otu_compositional_copy$taxa_name
rownames(abs_plot) <- tax_m
abs_plot
##Plot
plot_Log10_max <- log10(abs_plot)/max(log10(abs_plot) +1)
#plot_Log10_max <- log10(abs_plot +1)
plot_Log10_max[plot_Log10_max == "-Inf"] <- 0

colsHeat<- c("#F7F7F7", "#92C5DE", "#0571B0", "#F4A582", "#CA0020")

pheatmap(plot_Log10_max[, SampleOrder],
         cluster_cols = FALSE,
         cluster_rows = TRUE,
         gaps_row = 5, 
         clustering_distance_rows = "euclidean",
         clustering_distance_cols  = "euclidean",
         annotation_colors = metaR_AnnColourx, annotation_col = meta.factors, 
         show_colnames = FALSE,
         color = colorRampPalette(c("white", colsHeat))(50),
         border_color = "#f8edeb",
         display_numbers = FALSE)

meta.factorsT <- select(metaR, Transect)
meta.factorsE <- select(metaR, Ecotype)
Taxa_DataFrame <- data.frame(tax_table(metaR_Filter_composi.filt.abs))
Taxa_DataFrame_F <- Taxa_DataFrame %>% select(1,2,3,4,5,6,10)
Taxa_Matrix <- as.matrix(Taxa_DataFrame_F)

pseq_plot_heat <- metaR_Filter_composi.filt.abs

rown <- rownames(sample_data(pseq_plot_heat))
sample_data(pseq_plot_heat)$SampleID <- rown
sample_data(pseq_plot_heat) <- (sample_data(pseq_plot_heat)[, c(14,1,2,3,4,5,6,7,8,9,10,11,12,13)])

tax_table(pseq_plot_heat) <- as.matrix(Taxa_Matrix)

count_f <- data.frame(otu_table(pseq_plot_heat))
meta_f <- data.frame(sample_data(pseq_plot_heat))
taxa_f <- data.frame(tax_table(pseq_plot_heat))
names(taxa_f)[7] <- "Species"


d <- amp_load(
  otutable = count_f,
  metadata = meta_f,
  taxonomy = taxa_f
)
## Warning: Could not find a column named OTU/ASV in otutable, using rownames as
## OTU ID's
## Warning: Could not find a column named OTU/ASV in taxonomy, using rownames as
## OTU ID's
teste <-  amp_heatmap(
  d,
  group_by = c("Transect"),
  facet_by = c("Ecotype"),
  normalise = TRUE,
  #tax_add = "OTU",
  tax_aggregate = "Species",
  tax_show = 100,
  showRemainingTaxa = FALSE,
  tax_class = NULL,
  tax_empty = "best",
  plot_values = FALSE,
  plot_values_size = 3,
  plot_legendbreaks = NULL,
  plot_colorscale = "log10",
  plot_na = TRUE,
  measure = "mean",
  #sort_by = "Transect 1",
  min_abundance = 0.01,
  max_abundance = NULL,
  normalise_by = NULL,
  scale_by = NULL,
  color_vector = colorRampPalette(c("white", colsHeat))(50),
  round = 1,
  textmap = FALSE,
  plot_functions = FALSE,
  function_data = FALSE,
  functions = c("MiDAS", "Filamentous", "AOB", "NOB", "PAO", "GAO"),
  rel_widths = c(0.75, 0.25)
)
## Warning: The data has already been normalised. Setting normalise = TRUE (the
## default) will normalise the data again and the relative abundance information
## about the original data of which the provided data is a subset will be lost.
## Warning: There are only 98 taxa, showing all
print(teste)
## Warning: Transformation introduced infinite values in discrete y-axis

  • Relative abundance of the top 20 ASVs

6 Summary

Help is needed with running the “nfcore/ampliseq” pipeline developed at NGI for the analyses of fungal ITS1 amplicons, Illumina Miseq analysis NGI project ID P5953 (M.Hogberg_17_01_project summary from 2018-01-19 by Chuan Wang refers to P9723, >=Q30 (mean(SD), 70(2) (%), Sum Reads=15 650 000, Mean reads per sample 171 711, 1 pool of amplicons, 1 flowcell v3, PE 2x301bp (validated method), demultiplexing, quality control and raw data delivery on Uppmax (validated method). Agreement number M.Hogberg_16_01-20160826. Grus delivery project delivery 00654. Because my support application 2019-01-17 was rejected, I have collaborated with partners in US on this matter but all is extremely delayed for known reasons. I got some hope today when I read about the recently developed pipeline for fungal analyses! Unfortunately, I have no programming skills but have a BSc in Molecular Biology.

Short summary of the work.

7 Further Work

Further steps to be taken (if needed).

8 References

Relevant references for methods, tools etc.

9 Deliverables

Files delivered to the user with descriptions.

9.1 Directory

/data/processed/b/

8 directories, 18 files

Total size is XX GB.

10 Timeline

11 Practical Info

11.1 Data responsibility

The responsibility for data archiving lies with the PI of the project. We do not offer long-term storage or retrieval of data.

  • NBIS & Uppnex: We kindly ask that you remove the files from UPPMAX/UPPNEX. The main storage at UPPNEX is optimized for high-speed and parallel access, which makes it expensive and not the right place for longer time archiving. Please consider others by not taking up the expensive space. Please note that UPPMAX is a resource separate from the Bioinformatics Platform, administered by the Swedish National Infrastructure for Computing (SNIC) and SNIC-specifc project rules apply to all projects hosted at UPPMAX.
  • Sensitive data : Please note that special considerations may apply to the human-derived legally considered sensitive personal data. These should be handled according to specific laws and regulations as outlined e.g. here.
  • Long-term backup : We recommend asking your local IT for support with long-term data archiving. Also a newly established Data Office at SciLifeLab may be of help to discuss other options.

11.2 Acknowledgments

If you are presenting the results in a paper, at a workshop or conference, we kindly ask you to acknowledge us.

  • NBIS staff are encouraged to be co-authors when this is merited in accordance to the ethical recommendations for authorship, e.g. ICMJE recommendations. If applicable, please include Juliana, Assis Geraldo, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, NBIS, as co-author. In other cases, NBIS would be grateful if support by us is acknowledged in publications according to this example:

“Support by NBIS (National Bioinformatics Infrastructure Sweden) is gratefully acknowledged.”

“The computations were performed on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under Project SNIC 2022-22-352.”

  • NGI : For publications based on data from NGI Sweden, NGI, SciLifeLab and UPPMAX should be acknowledged like so:

“The authors would like to acknowledge support from Science for Life Laboratory (SciLifeLab), the National Genomics Infrastructure (NGI), and Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) for providing assistance in massive parallel sequencing and computational infrastructure.”

12 Support Completion

You should soon be contacted by one of our managers with a request to close down the project in our internal system and for invoicing matters. If we do not hear from you within 30 days the project will be automatically closed and invoice sent. Again, we would like to remind you about data responsibility and acknowledgements, see sections: Data Responsibility and Acknowledgments.

You are welcome to come back to us with further data analysis request at any time via http://nbis.se/support/support.html.

Thank you for using NBIS.